perf: switch saturation tests to static/1KB logs and reduce loadgen to SUT ratio to 2:1 #2643

Draft
cijothomas wants to merge 4 commits into open-telemetry:main from cijothomas:cijothomas/saturation-static

@cijothomas cijothomas commented Apr 13, 2026

Change Summary

  • Switch saturation tests to the static data source with 1KB log bodies (pre_generated strategy), which gives more realistic payload sizes than the tiny semantic-convention bodies.
  • Reduce the loadgen:engine core ratio from 3:1 to 2:1 across all saturation configs. Two loadgen cores per engine core are sufficient to saturate the engine, which frees resources for a new 32-core test.
  • Add a 32-core saturation test, which is now possible because fewer total cores are needed.
  • Improve traffic generator body entropy: the body pool now holds 512 unique entries, each built by cycling through different log templates at different offsets and prefixed with a deterministic seq=N. This yields realistic per-body compression ratios (~3:1); the previous approach repeated a single template ~7× per body (~7:1).
  • Make data_source and log_body_size_bytes templatable in the loadgen config, with backward-compatible defaults.
  • Existing non-saturation tests are unaffected (defaults unchanged).

How are these changes tested?

Validated locally on a 16-core machine:

| Test | Engine CPU (normalized) | Logs/sec | Exit |
|---|---|---|---|
| 1-core (2:1 ratio) | 93.2% | 176,037 | 0 |
| 2-core (2:1 ratio) | 91.3% | 277,197 | 0 |

Both tests confirm that the engine is saturated at the 2:1 ratio. Full orchestrator reports were generated successfully (the shutdown and SQL report bugs that existed on the old branch were resolved by merging the latest main).

Are there any user-facing changes?

No. Internal perf tests and traffic generator only.

@cijothomas cijothomas requested a review from a team as a code owner April 13, 2026 16:36

codecov Bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.03%. Comparing base (b645a26) to head (c147027).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2643      +/-   ##
==========================================
- Coverage   86.03%   86.03%   -0.01%     
==========================================
  Files         720      720              
  Lines      273264   273270       +6     
==========================================
  Hits       235095   235095              
- Misses      37645    37651       +6     
  Partials      524      524              
| Components | Coverage Δ |
|---|---|
| otap-dataflow | 87.19% <100.00%> (-0.01%) ⬇️ |
| query_abstraction | 80.61% <ø> (ø) |
| query_engine | 89.57% <ø> (ø) |
| otel-arrow-go | 52.45% <ø> (ø) |
| quiver | 92.25% <ø> (ø) |

…atio to 2:1

- Switch saturation tests to use static data source with 1KB log bodies
  (pre_generated strategy) for more realistic payload sizes
- Reduce loadgen:engine core ratio from 3:1 to 2:1 across all saturation configs
- Make data_source and log_body_size_bytes templatable in loadgen config
- Existing non-saturation tests are unaffected (defaults unchanged)

Validated locally on 16-core machine (4 runs each, 1/2/4 engine cores):
- 2:1 ratio saturates engine to 96-100% CPU at 1-2 cores
- 33% fewer loadgen cores needed
- 3.75x higher data throughput (514 MB/s vs 137 MB/s at 2 cores)
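
The figures in this commit message can be checked with a little arithmetic. The sketch below is illustrative only: `total_cores` and `reduction` are hypothetical helper names, not part of the perf tooling, and the sketch assumes a run needs exactly the engine cores plus the loadgen cores implied by the ratio.

```rust
/// Total cores for one saturation run: engine cores plus loadgen cores
/// at `loadgen_per_engine` loadgen cores per engine core.
/// (Hypothetical helper; assumes no other components need dedicated cores.)
fn total_cores(engine_cores: u32, loadgen_per_engine: u32) -> u32 {
    engine_cores + engine_cores * loadgen_per_engine
}

/// Fractional reduction when going from `old` to `new`.
fn reduction(old: f64, new: f64) -> f64 {
    (old - new) / old
}

fn main() {
    // Core budget at the old 3:1 ratio and the new 2:1 ratio.
    for engine in [1u32, 2, 4, 32] {
        let old = total_cores(engine, 3);
        let new = total_cores(engine, 2);
        println!("engine={engine:>2}  3:1 needs {old:>3} cores, 2:1 needs {new:>3} cores");
    }

    // Loadgen cores per engine core drop from 3 to 2.
    println!("loadgen core reduction: {:.0}%", reduction(3.0, 2.0) * 100.0);

    // Data throughput quoted above: 514 MB/s vs 137 MB/s at 2 engine cores.
    println!("throughput ratio: {:.2}x", 514.0 / 137.0);
}
```

Under these assumptions, a 32-core engine run drops from 128 total cores at 3:1 to 96 at 2:1.
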
@cijothomas cijothomas force-pushed the cijothomas/saturation-static branch from 399642f to 6354a14 Compare April 13, 2026 16:41
@cijothomas cijothomas marked this pull request as draft April 13, 2026 20:56
…late

The saturation template sets data_source=static and log_body_size_bytes=1024,
but these variables were not forwarded through the test steps template to the
loadgen config template. Add the passthrough so the loadgen actually uses
static/1KB logs as intended.
@github-actions

This pull request has been marked as stale due to lack of recent activity. It will be closed in 30 days if no further activity occurs. If this PR is still relevant, please comment or push new commits to keep it active.

@github-actions github-actions Bot added stale Not actively pursued and removed stale Not actively pursued labels Apr 28, 2026
@github-actions

This pull request has been marked as stale due to lack of recent activity. It will be closed in 30 days if no further activity occurs. If this PR is still relevant, please comment or push new commits to keep it active.

@github-actions github-actions Bot added the stale Not actively pursued label May 13, 2026
Resolve conflict in df-loadgen-steps-docker.yaml: keep both data_source/log_body_size_bytes
from this branch and exporter_extra_config from main.

The build_body_pool function previously padded log bodies by repeating the
SAME template text to reach the target size. For 1KB bodies, a ~150-char
template was repeated ~7x, creating unrealistically compressible content
(7.4:1 per body vs ~1.5:1 for real logs).

Changes:
- Cycle through DIFFERENT templates to fill each body instead of repeating one
- Add deterministic 'seq=NNNN' prefix so no two pool entries are identical
- Expand pool from 50 to 512 entries (matches default batch_size) to avoid
  duplicate bodies within a batch inflating cross-record compression

Per-body compression improves from 7.4:1 to 3.1:1 (much closer to real
structured log text). Batch-level compression stays at ~18.7:1 which is
realistic for OTLP batches with shared schema/keys.
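
The pool-building approach described in this commit can be sketched roughly as follows. This is not the actual build_body_pool code: `build_body_pool_sketch`, `build_body`, and the `TEMPLATES` contents are hypothetical, and the way offsets are chosen here is only one possible reading of "different templates at different offsets". The sketch assumes ASCII templates so that byte slicing and truncation are safe.

```rust
use std::collections::HashSet;

/// Hypothetical ASCII log templates; the real generator's templates are not shown here.
const TEMPLATES: [&str; 4] = [
    "INFO request completed method=GET path=/api/v1/items status=200 duration_ms=12 ",
    "WARN retrying upstream call service=inventory attempt=2 backoff_ms=250 ",
    "ERROR failed to write batch exporter=otlp reason=timeout pending=128 ",
    "DEBUG cache lookup key=user:profile hit=false shard=7 ",
];

/// Build one body of exactly `target_len` bytes for pool entry `index`.
fn build_body(index: usize, target_len: usize) -> String {
    // Deterministic prefix so no two pool entries are identical.
    let mut body = format!("seq={index:04} ");
    // Start the template cycle at a position that depends on the entry index.
    let mut t = index;
    while body.len() < target_len {
        let template = TEMPLATES[t % TEMPLATES.len()];
        // Begin part-way into the template so consecutive chunks differ.
        let offset = (index + t) % template.len();
        body.push_str(&template[offset..]);
        t += 1;
    }
    // Safe on a byte boundary because all content is ASCII.
    body.truncate(target_len);
    body
}

/// Build `pool_size` bodies, each `target_len` bytes long.
fn build_body_pool_sketch(pool_size: usize, target_len: usize) -> Vec<String> {
    (0..pool_size).map(|i| build_body(i, target_len)).collect()
}

fn main() {
    let pool = build_body_pool_sketch(512, 1024);
    let unique: HashSet<&String> = pool.iter().collect();
    println!("entries={} unique={}", pool.len(), unique.len());
}
```

Filling each body from several templates, rather than repeating one, is what lowers per-body compressibility, and sizing the pool to the batch size avoids repeated bodies within a single batch.
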
@github-actions github-actions Bot added the rust Pull requests that update Rust code label May 13, 2026
@cijothomas cijothomas marked this pull request as ready for review May 13, 2026 13:51
@cijothomas cijothomas marked this pull request as draft May 13, 2026 13:52
@github-actions github-actions Bot removed the stale Not actively pursued label May 14, 2026

Labels

rust Pull requests that update Rust code

2 participants